Conversation

mitya52 (Member) commented Apr 14, 2025

No description provided.

mitya52 requested a review from JegernOUTT April 14, 2025 18:41
mitya52 merged commit 92f83ab into dev Apr 15, 2025
7 checks passed
mitya52 deleted the completion-rag-budget-14-04-25 branch April 15, 2025 08:52
MarcMcIntosh added a commit that referenced this pull request Apr 15, 2025
* fix: limit the number of tokens a chat can use.

* fix: linter errors.

* fix: dispatch information callout and disable input.

* fix: missing dep issue.

* refactor: usage total.

* add custom build workflow

* Update README.md

* Update supported-models.md

Added a current model list & polished descriptions.

* Loop limits (#639)

* un-disable input when limit reached.

* chore: add `compression_strength` to tool messages

* add paused state to thread

* add hook to pause auto send based on compression.

* ui: let the user know that their chat is being compressed.

* fix: linter issues after removing `limitReached` information call out.

* fix: also use `/links` to decide if a new chat should be suggested.

* refactor: remove `useTotalTokenUsage` hook.

* add comments about `newChatSuggested`.

* pause and unpaused using newChatSuggested.

* fix(NewChatSuggested): use a hook to get the compression strength.

* feat: add second condition for pausing the chat.

* case: it might be possible to have many messages without compression.

* feat: showing a placeholder message to the user if history is empty

* fix: intercepting currentUsage when choices are empty array and upserting them to last assistant message

* refactor: change telemetry from macro to middleware

* refactor: /ping and /caps handlers

* refactor: code_completion_post_validate doesn't consume CodeCompletionPost

* chore: remove ScratchError.to_response unused function
it was only used in telemetry, and it was replaced by IntoResponse

* fix: add back "client will see" message

* doc: clarification about extension in scratch error into_response

* wip: remove attach file checkbox. (#644)

* wip: remove attach file checkbox.

* feat: attach files button.

* test: active file is no longer attached by default.

* add an event for the ide to attach a file to chat.

* fix: remove attached files after submit.

* Add timeout to aiohttp session and handle JSON parsing errors

* Extend gradient_type to support new range and behavior

* Add logging for context and token limits in history pruning function.

* Attach preview (#648)

* fix: auto attach file.

* add attached files to command preview request.

* test: attach file on new chat test

* change the ui for compressed messages and fix shared loading states.

* fix: unused variable.

* fix: linting issues

* fix: getting fresh port value on each polling iteration & dispatching resetApiState()

* chore: removed empty line & excessive comment

* fix: better ping polling control

* fix: simplified pingApi

* fix: set body class when appearance changes. (#653)

* 3rdparty setup 09 03 25 (#657)

* init

* fixes

* remove redirect

* next

* models from litellm info

* fetch providers and models

* providers improvements

* add model modal changes

* 3rd party model setup fixes

* api fixes

* renames

* api refactor

* dont show Available Chat Models

* fix adding models

* some fixes

* update api

* add enabled to backend, fixes

* remove unneeded code

* fixes

* inference wip

* UI/models cleanup

* available models

* thirdparty utils

* dont use passthrough minidb

* refactor

* caps rebuild wip

* finetune_info is always list

* thirdparty models in caps

* small fixes

* another fixes

* spad config in model db

* tokenizer endpoints

* telemetry_endpoints

* caps v2 parsing

* update migration from v2 to internal v1

* prompt based chat

* embeddings -> embedding

* fix embeddings

* embeddings fix

* model config instead of str

* ui support buttons

* all providers

* show providers without models at all

* optional api key

* custom config

* ThirdPartyModel improvement

* no custom models for providers

* custom model config

* custom model for no-predefined providers only

* update model fixes

* api base

* fixes

* inference name

* api key is not required

* api key or dummy

* refactor of api

* to dict

* another dict conversion

* UI rework

* apiConfig update

* continue refactor

* continue refactor

* remove custom badge

* UI fixes

* badges fix

* fix provider name

* api keys UI

* show api keys

* remove api_keys from api

* remove api_keys from UI

* add custom provider

* fixes in chat/completions

* remove provider logic

* remove unneeded stuff

* remove commented old style api keys

* validation initial

* caps validation fixes

* continue

* hide api base for non custom providers

* container for extended configuration

* api key selection fixes

* collapsible advanced options

* initial tokenizers API

* ui initial

* implement tokenizer upload and get

* create tokenizers dir

* update tokenizer ui

* move thirdparty from hamburger

* fix twice click

* api enhance

* fixes

* fix api

* fix tokenizers modal

* tokenizer select in model

* get tokenizer logic

* 3rdparty utils refactor

* refactor tokenizers api

* fix upload

* default tokenizers

* ui tokenizers defaults and uploaded

* rework tokenizers section

* move scc into .css file

* enhance UI, wip

* tokenizers UI improve

* update setup, up version

* required tokenizer

* migration

* fix circular imports

* migration fixes

* another fix

* another fix 2

* migration fixes 3

* reasoning params and migration

* reasoning UI

* default tokenizer

* oops

* add custom build workflow

* remove deprecated models

* fix setup.py

* ui improvements

* model name for custom

* backend validation

* fixes

* model name validation

* gpt4o at the top

* set tokenizer if has default

* up caps version if 3rdparty updated

* remove old n_ctx arg

* static caps data

* fix caps

* backward compat with prev lsp versions

* up lsp version due to new style caps for server support

* fix parse version call

* add missed customization and default system prompt

* embedding models args

---------

Co-authored-by: Kirill Starkov <[email protected]>

* fix: merge usage among several messages

* move model record logic to model assigner (#658)

* fix: kill subprocesses and their children if the future is cancelled, so cancelling a request does not leave zombie processes

* fix: update tokio-tar by changing it to astral-tokio-tar

* feat: Dockerfile for statically linked release build

* gemini 25 pro and chatgpt 4o

* New models 07 04 25 (#664)

* gemini 25 pro and chatgpt 4o

* chatgpt4o has no tools

* fix: increase python lsp runner buffer limit

* fix: raise limit to 64MB

* fix: use ArchiveBuilder to preserve permissions

* fix: limit the number of tokens a chat can use.

 Last command done (1 command done):
    pick ab05f48 fix: limit the number of tokens a chat can use.
 Next commands to do (6 remaining commands):
    pick 405bf45 fix: linter errors.
    pick 2098b4f fix: dispatch information callout and disable input.
 You are currently rebasing branch 'fix-conflicts' on '8f8a0078'.
 Changes to be committed:
	modified:   refact-agent/gui/src/components/ChatForm/ChatForm.tsx
	modified:   refact-agent/gui/src/hooks/index.ts
	modified:   refact-agent/gui/src/hooks/useSendChatRequest.ts
	new file:   refact-agent/gui/src/hooks/useTotalTokenUsage.ts

* fix: specify custom address url for the container

* fix: remove usage limits ui.

 interactive rebase in progress; onto 9a4baf9
 Last commands done (36 commands done):
    pick a7e1172 fix: specify custom address url for the container
    pick 86ec25a fix: remove usage limits ui.
 Next commands to do (10 remaining commands):
    pick 72430e5 enable compression button
    pick 3470d3c add forceReload event on any tool result. (#673)
 You are currently rebasing branch 'dev+main-150425' on '9a4baf93'.

 Changes to be committed:
	modified:   refact-agent/gui/src/components/ChatForm/ChatForm.tsx
	modified:   refact-agent/gui/src/hooks/index.ts
	deleted:    refact-agent/gui/src/hooks/useTotalTokenUsage.ts

* enable compression button

* add forceReload event on any tool result. (#673)

* add forceReload event on any tool result.

* rename forceReload event

* chore: rename forceReload

* fix: compression stop, only stop the chat when it's compressed and more than 40 messages from the last user message.

* fix: combine compression and chat suggestion so they use similar logic.

* enable send if the user dismisses the request to start a new chat

* fix: open chat suggestion box when restoring a large chat.

* fix: tour link colour in light mode (#676)

* fix: don't fail on confirmation to send error to the model

* refact, starcoder2, deepseek-coder deprecation (#674)

* n_ctx from model assigner (#677)

* n_ctx from model assigner

* models_dict_patch

* fix missed fields and patch pass

* add gpt41 to known models (#679)

* fix: cat tool in windows abs paths, try last colon, and not fail if there is not a line range

* recompute rag completion tokens if out-of-budget (#678)

* fix: more robust json parse for follow ups and locate

---------

Co-authored-by: Kirill Starkov <[email protected]>
Co-authored-by: Awantika <[email protected]>
Co-authored-by: bystrakowa <[email protected]>
Co-authored-by: alashchev17 <[email protected]>
Co-authored-by: MDario123 <[email protected]>
Co-authored-by: JegernOUTT <[email protected]>
Co-authored-by: Dimitry Ageev <[email protected]>
Co-authored-by: Humberto Yusta <[email protected]>

2 participants